Abstract. The orthogonal multi-matching pursuit (OMMP) is a natural
extension of orthogonal matching pursuit (OMP). We denote the OMMP with
the parameter M as OMMP(M), where $M \ge 1$ is an integer. The main
difference between OMP and OMMP(M) is that OMMP(M) selects M atoms
per iteration, while OMP only adds one atom to the optimal atom set. In
this paper, we study the performance of orthogonal multi-matching pursuit
(OMMP) under RIP. In particular, we show that, when the measurement
matrix A satisfies $(9s, 1/10)$-RIP, there exists an absolute constant
$M_0 \le 8$ so that OMMP($M_0$) can recover s-sparse signals within s
iterations. We furthermore prove that, for slowly-decaying s-sparse signals,
OMMP(M) can recover the s-sparse signal within $O(s/M)$ iterations for a
large class of M. In particular, for $M = s^a$ with $a \in [0, 1/2]$, OMMP(M)
can recover slowly-decaying s-sparse signals within $O(s^{1-a})$ iterations.
The result implies that OMMP can substantially reduce the computational
complexity.

1. Introduction 

1.1. Orthogonal Matching Pursuit. Orthogonal matching pursuit (OMP) is a
popular algorithm for the recovery of sparse signals. Its major features
are computational simplicity and ease of implementation, and hence it is
also commonly used in compressed sensing. Let A be a matrix of size
$m \times N$ and y be a vector of size m. The aim of OMP is to find an
approximate solution to the following $\ell_0$-minimization problem:

$$\min_{x} \|x\|_0 \quad \text{s.t.} \quad Ax = y.$$

In compressed sensing and the sparse representation of signals, we often have
$m \ll N$. Throughout this paper, we suppose that the sampling matrix
$A \in \mathbb{C}^{m \times N}$ has $\ell_2$-normalized columns $a_1, \ldots, a_N$.

To introduce the performance of OMP, we first recall the definition of the
restricted isometry property (RIP) [6], which is frequently used in the
analysis of recovery algorithms in compressed sensing. Following Candès and
Tao, for $1 \le k \le N$ and $\delta_k \in [0, 1)$, we say that the matrix A
satisfies $(k, \delta_k)$-RIP if

(1) $$(1-\delta_k)\|x\|_2^2 \le \|Ax\|_2^2 \le (1+\delta_k)\|x\|_2^2$$

holds for all k-sparse x. We next state the definition of the spark (see also [1]).

Definition 1. The spark of a matrix A is the size of the smallest linearly dependent
subset of columns, i.e.,

$$\mathrm{Spark}(A) := \min\{\|x\|_0 : Ax = 0,\ x \neq 0\}.$$
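For very small matrices, both the spark and the RIP constant in (1) can be computed by exhaustive search over column subsets, which helps build intuition for the two definitions. The following is a minimal sketch under that assumption; the function names `spark` and `rip_constant` and the tolerance `tol` are illustrative choices, not part of the paper, and the cost grows combinatorially in N.

```python
import itertools
import math

import numpy as np


def spark(A, tol=1e-10):
    """Size of the smallest linearly dependent column subset (math.inf if none)."""
    n_cols = A.shape[1]
    for k in range(1, n_cols + 1):
        for cols in itertools.combinations(range(n_cols), k):
            # k columns are linearly dependent exactly when their rank is below k.
            if np.linalg.matrix_rank(A[:, list(cols)], tol=tol) < k:
                return k
    return math.inf


def rip_constant(A, k):
    """Smallest delta for which (1) holds for all k-sparse x (exhaustive search)."""
    n_cols = A.shape[1]
    delta = 0.0
    for cols in itertools.combinations(range(n_cols), k):
        sub = A[:, list(cols)]
        # Eigenvalues of the Gram matrix bound ||A x||^2 / ||x||^2 on this support.
        eigs = np.linalg.eigvalsh(sub.conj().T @ sub)  # ascending order
        delta = max(delta, abs(eigs[-1] - 1.0), abs(1.0 - eigs[0]))
    return delta
```

Searching only over supports of size exactly k suffices, because the extreme eigenvalues of a smaller Gram submatrix are interlaced by those of any larger one containing it.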
Theoretical analysis of OMP has concentrated primarily on two directions. The
first one is to study the condition on the matrix A under which OMP can recover
s-sparse signals in exactly s iterations. In this direction, one uses the coherence
and RIP to analyze the performance of OMP. In particular, Davenport and Wakin
showed that, when the matrix A satisfies $(s+1, \frac{1}{3\sqrt{s}})$-RIP, OMP can
recover s-sparse signals in exactly s iterations [8]. The sufficient condition is
improved to $(s+1, \frac{1}{\sqrt{s}+1})$-RIP in [HlIIl] (see also [IHllIl]). However,
it was observed in p] that, when the matrix A satisfies $(c_0 s, \delta_0)$-RIP for
some fixed constants $c_0 > 1$ and $0 < \delta_0 < 1$, s iterations of OMP are not
enough to uniformly recover the s-sparse signals, which implies that OMP has to run
for more than s iterations to uniformly recover the s-sparse signals. Hence, one
investigates the performance of OMP along the second line, allowing OMP to run
for more than s iterations. In this case, it is possible that OMP adds wrong atoms
to the optimal atom set, but one can identify the correct atoms by least squares.
A main result in this direction is presented by Zhang [20], who proved that, when A
satisfies $(31s, 1/3)$-RIP, OMP can recover the s-sparse signal in at most 30s
iterations.

Other greedy algorithms based on OMP have also been proposed, including the
regularized orthogonal matching pursuit (ROMP) [M], subspace pursuit (SP) [7],
CoSaMP [15], and many other variants. For each of these algorithms, it has been
shown that, under a natural RIP setting, they can recover the s-sparse signals
in s iterations.

1.2. Orthogonal Multi-matching Pursuit and Main Results. A more natural
extension of OMP is orthogonal multi-matching pursuit (OMMP) [11]. We
denote the OMMP with the parameter M as OMMP(M), where $M \ge 1$ is an
integer. The main difference between OMP and OMMP(M) is that OMMP(M) selects
M atoms per iteration, while OMP only adds one atom to the optimal atom set.
Algorithm 1 outlines the procedure of OMMP(M) with initial feature set $\Lambda^0$.
In comparison with OMP, OMMP has fewer iterations and lower computational
complexity [10]. We note that, when M = 1, OMMP(M) is identical to OMP. OMMP
is also studied in [10], [12], [18] under the names of KOMP, MOMP and gOMP,
respectively. These results show that, when the RIP constant
$\delta = O(\sqrt{M/s})$, OMMP(M) can recover the s-sparse signal in s iterations.

The aim of this paper is to study the performance of OMMP(M) under a more
natural setting of RIP (the RIP constant is an absolute constant). Moreover, we
would also like to understand the relation between the number of iterations and
the parameter M. So, we are interested in the following questions:
Question 1: Does there exist an absolute constant $M_0$ so that OMMP($M_0$) can
recover all the s-sparse signals within s iterations?

Question 2: For $1 \le M \le s$, can OMMP(M) recover the s-sparse signal within
$O(s/M)$ iterations?

We next state one of our main results, which gives an affirmative answer to
Question 1.

Theorem 1. Let x be an s-sparse signal in $\mathbb{C}^N$ and $S = \mathrm{supp}(x)$.
Suppose that the sampling matrix $A \in \mathbb{C}^{m \times N}$ satisfies
$(9s, 1/10)$-RIP and $\mathrm{Spark}(A) > Ms$. Then OMMP(M) can recover the signal x
within, at most, $\max\{s', \frac{8}{M}s'\}$ iterations, where
$s' := \#(S \setminus \Lambda^0)$ and $\Lambda^0$ is the initial feature set.
Algorithm 1 OMMP(M)

Input: sampling matrix A, samples y = Ax, candidate number M for each step,
stopping iteration index H, initial feature set $\Lambda^0 \subset \{1, \ldots, N\}$
Output: the $x^*$.
Initialize: $r^0 = y$, $x^0 = 0$, $\ell = 0$.
while $\ell < H$ do
    match: $h^\ell = A^* r^\ell$
    calculate: $T^\ell$ = M indices corresponding to the largest magnitude
               entries in the vector $h^\ell$
    identify: $\Lambda^{\ell+1} = \Lambda^\ell \cup T^\ell$
    update: $x^{\ell+1} = \mathrm{argmin}_{z:\ \mathrm{supp}(z) \subset \Lambda^{\ell+1}} \|y - Az\|_2$
            $r^{\ell+1} = y - A x^{\ell+1}$
            $\ell = \ell + 1$
end while
$x^* = x^H$
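The steps of Algorithm 1 can be sketched in NumPy as follows. This is an illustrative sketch rather than the authors' implementation: the function names, the return values, and the early exit once the residual is numerically zero (controlled by the hypothetical parameter `tol`) are assumptions added here; Algorithm 1 itself simply runs H iterations.

```python
import numpy as np


def least_squares_on(A, y, support):
    """Minimize ||y - A z||_2 over vectors z supported on `support`."""
    x = np.zeros(A.shape[1], dtype=np.result_type(A.dtype, y.dtype, np.float64))
    if support:
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        x[support] = coef
    return x


def ommp(A, y, M, H, init_support=(), tol=1e-10):
    """Run at most H iterations of OMMP(M); return (estimate, support, iterations)."""
    support = sorted(set(init_support))
    x = np.zeros(A.shape[1])
    r = y.copy()                                            # r^0 = y, x^0 = 0
    iterations = 0
    # Assumption: stop early when the residual is (numerically) zero.
    while iterations < H and np.linalg.norm(r) > tol * max(1.0, np.linalg.norm(y)):
        h = A.conj().T @ r                                  # match
        T = np.argsort(np.abs(h))[-M:]                      # M largest |h_j|
        support = sorted(set(support) | set(T.tolist()))    # identify
        x = least_squares_on(A, y, support)                 # update
        r = y - A @ x
        iterations += 1
    return x, support, iterations
```

With M = 1 the loop reduces to OMP. Because the residual is orthogonal to the columns already selected, their correlations are essentially zero and the selection step adds new indices as long as enough columns remain correlated with the residual.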



The above theorem shows that, when $M \ge 8$, OMMP(M) with the initial feature
set $\Lambda^0 = \emptyset$ can recover all the s-sparse signals within, at most, s
iterations. It implies that there exists an absolute constant $M_0 \le 8$ so that
OMMP($M_0$) can recover all the s-sparse signals within s iterations. We believe
that the constant $M_0 = 8$ is not optimal. Numerical experiments lead us to
conjecture that the optimal number is 2, i.e., under RIP, OMMP(2) can recover the
s-sparse signal within s iterations.

We next turn to Question 2. The following theorem shows that, when
$1 \le M \le \sqrt{s}$, OMMP(M) can recover slowly-decaying signals within
$O(s/M)$ iterations.

Theorem 2. Let $x \in \mathbb{C}^N$. Suppose that $S = \mathrm{supp}(x)$ and $s = \#S$. Set

$$C_0 := \frac{\max_{j \in S}|x_j|}{\min_{j \in S}|x_j|}.$$

Consider the OMMP(M) algorithm with $1 \le M \le \sqrt{s}$. Suppose that the
sampling matrix $A \in \mathbb{C}^{m \times N}$ satisfies $(9s, 1/10)$-RIP and
$\mathrm{Spark}(A) > 8(C_0^2 + 2)s$. Then OMMP(M) recovers the s-sparse signal x
within $C_1 \frac{s}{M}$ iterations with $C_1 = 8(C_0^2 + 2)$.

The theorem above shows that, for $1 \le M \le \sqrt{s}$, OMMP(M) can recover the
s-sparse signal within $C_1 \frac{s}{M}$ iterations. Here, the constant $C_1$
depends on the signal x. In particular, if we take $M = \lfloor s^a \rfloor$ in
Theorem 2, we have

Corollary 1. Under the condition of Theorem 2, if $M = \lfloor s^a \rfloor$ with
$a \in [0, 1/2]$, then OMMP(M) recovers the s-sparse signal within
$2C_1 s^{1-a}$ iterations with $C_1 = 8(C_0^2 + 2)$.
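The factor 2 in Corollary 1 comes from rounding $s^a$ down to an integer. A short derivation, using only the bound of Theorem 2 and the elementary inequality $\lfloor t \rfloor \ge t/2$ for $t \ge 1$, can be written as:

```latex
% For a in [0, 1/2] and s >= 1 we have 1 <= s^a <= sqrt(s), so
% M = floor(s^a) satisfies 1 <= M <= sqrt(s) and Theorem 2 applies.
\[
  M = \lfloor s^{a} \rfloor \;\ge\; \tfrac{1}{2}\, s^{a}
  \quad\Longrightarrow\quad
  C_1 \frac{s}{M} \;\le\; C_1 \frac{s}{s^{a}/2} \;=\; 2 C_1 s^{1-a}.
\]
% Example: s = 100, a = 1/2 gives M = 10, so Theorem 2 yields at most
% C_1 * 10 iterations, which is below the corollary's bound 2 * C_1 * 10.
```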

We next consider the case with $M = a \cdot s$. In particular, for 'small' a, we
give an affirmative answer to Question 2 up to a log factor.
Theorem 3. Let $x \in \mathbb{C}^N$. Suppose that $S = \mathrm{supp}(x)$ and $s = \#S$. Set

$$C_0 := \frac{\max_{j \in S}|x_j|}{\min_{j \in S}|x_j|}.$$

Consider the OMMP(M) algorithm with M = a ■ s and Q < a < „ . Sup 



pose that the sampling matrix $A \in \mathbb{C}^{m \times N}$ satisfies
$(13s, 1/10)$-RIP and $\mathrm{Spark}(A) > 8s\log_2(2(s+1))$. Then OMMP(M) recovers
the s-sparse signal x from y = Ax within $\frac{8}{a}\log_2(2(s+1))$ iterations.

Remark 1. We prove the main results using some of the techniques developed by
Zhang in his study of OMP [20] (see also [9]). To make the paper more readable, 
we state our results for the strictly sparse signal. In fact, using a similar method, 
one also can extend the results in this paper to the case where the measurement 
vector y is subjected to an additive noise and x is not strictly sparse. 

Remark 2. In [11], Liu and Temlyakov proved that, when A satisfies
$(M_0 s, \delta)$-RIP with $\delta = \sqrt{M_0}/((2+\sqrt{2})\sqrt{s})$, OMMP($M_0$)
can recover s-sparse signals within, at most, s iterations. The result requires
that the RIP constant $\delta$ depend on $s = \|x\|_0$. In Theorem 1, we require
that the measurement matrix A satisfies $(9s, \delta)$-RIP with $\delta$ being an
absolute constant 1/10. Hence, Theorem 1 gives an affirmative answer to
Question 1 under the more natural setting for the measurement matrix A.

Remark 3. It is of interest to know which matrices A obey the $(k, \delta)$-RIP and
$\mathrm{Spark}(A) > K$, where K is a fixed constant. Much is known about finding
matrices that satisfy the $(k, \delta)$-RIP (see [2], [4], [5], [17], [19]). If we
draw a random $m \times N$ matrix A whose entries are i.i.d. sub-Gaussian random
variables, then Spark(A) = m with probability 1 (see [HIS]). Moreover, the random
matrix A also satisfies $(k, \delta)$-RIP with high probability provided

$$m = O\!\left(\frac{k\log(N/k)}{\delta^2}\right).$$

So, to make the random matrices A obey the $(k, \delta)$-RIP and
$\mathrm{Spark}(A) > K$, one can take

'Hog(iV/fc)\ 
m = max <! O | j 



O 



2. Numerical experiments 

The purpose of the experiment is to compare the reconstruction performance and
the iteration number of OMMP(M) for different values of the parameter M.
Given the parameters m and N, we randomly generate an $m \times N$ sampling
matrix A from the standard i.i.d. Gaussian ensemble. The support set S of the
sparse signal x is drawn from the uniform distribution over the set of all subsets
of $[1, N] \cap \mathbb{Z}$ of size s. We then generate the sparse signal x according
to the following probability model: the entries $x_j$, $j \in S$, are independent
random variables having the Gaussian distribution with mean 5 and standard
deviation 1.

We apply OMMP(M) to recover the sparse signal x from y = Ax for different
parameters $M \in \{1, \lfloor\sqrt{s}\rfloor, \lfloor s/2 \rfloor\}$. Note that when
M = 1, OMMP(M) is identical to OMP. We repeat the experiment 200 times for each
number $s \in \{1, 2, \ldots, 80\}$ and calculate the success rate. When OMMP
succeeds, we record the number of iteration steps. The left graph in Fig. 1
depicts the success rate of the reconstruction algorithm OMMP(M) with
$M \in \{1, \lfloor\sqrt{s}\rfloor, \lfloor s/2 \rfloor\}$. The average number of
iteration steps of OMMP(M) with
$M \in \{1, \lfloor\sqrt{s}\rfloor, \lfloor s/2 \rfloor\}$ is illustrated in
the right graph in Fig. 1. The numerical results show that the success rate of
OMMP(M), $M \in \{\lfloor\sqrt{s}\rfloor, \lfloor s/2 \rfloor\}$, is similar to
that of OMP, while the number of iteration steps of OMMP(M),
$M \in \{\lfloor\sqrt{s}\rfloor, \lfloor s/2 \rfloor\}$, is far less than that of
OMP, which agrees with the theoretical results presented in this paper.
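The experiment described in this section can be sketched as below. Several details are assumptions made for illustration, since the text does not fix them: the values of m and N, the normalization of the columns of A, the cap of s iterations per run, the relative tolerances used to declare success, and the names `ommp` and `run_trials`. The code is a sketch of the protocol, not the script that produced Fig. 1.

```python
import numpy as np


def ommp(A, y, M, max_iter, tol=1e-8):
    """Compact real-valued OMMP(M) starting from an empty feature set."""
    support, x, r, it = [], np.zeros(A.shape[1]), y.copy(), 0
    while it < max_iter and np.linalg.norm(r) > tol * np.linalg.norm(y):
        top = np.argsort(np.abs(A.T @ r))[-M:]            # M best-matching atoms
        support = sorted(set(support) | set(top.tolist()))
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        x = np.zeros(A.shape[1])
        x[support] = coef
        r = y - A @ x
        it += 1
    return x, it


def run_trials(m, N, s, M, trials, seed=0):
    """Return (success rate, mean iteration count over the successful runs)."""
    rng = np.random.default_rng(seed)
    successes, iters = 0, []
    for _ in range(trials):
        A = rng.standard_normal((m, N))
        A /= np.linalg.norm(A, axis=0)                    # assumption: unit-norm columns
        S = rng.choice(N, size=s, replace=False)          # uniformly random support
        x = np.zeros(N)
        x[S] = rng.normal(loc=5.0, scale=1.0, size=s)     # mean 5, standard deviation 1
        x_hat, it = ommp(A, A @ x, M, max_iter=s)         # assumption: at most s iterations
        if np.linalg.norm(x_hat - x) <= 1e-6 * np.linalg.norm(x):
            successes += 1
            iters.append(it)
    mean_iters = float(np.mean(iters)) if iters else float("nan")
    return successes / trials, mean_iters
```

A sweep such as `run_trials(m, N, s, M, trials=200)` over s = 1, ..., 80 and the chosen values of M would give the two curves per parameter (success rate and average iteration count) that the figure reports; m and N must be supplied by the reader.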




Figure 1. Numerical experiments for the sparse signals. The
left graph corresponds to the success rates of OMMP(M),
$M \in \{1, \lfloor\sqrt{s}\rfloor, \lfloor s/2 \rfloor\}$, whereas the right one
depicts the average number of iteration steps of OMMP(M),
$M \in \{1, \lfloor\sqrt{s}\rfloor, \lfloor s/2 \rfloor\}$.



3. Extension 

According to Theorem 2 and Theorem 3, OMMP has a good performance for
the slowly-decaying sparse signal x. Naturally, one may want to know whether
OMMP(M) can recover all the s-sparse signals within less than s iterations for some
$M \in [1, s] \cap \mathbb{Z}$. Numerical experiments show that, for some
fast-decaying s-sparse signal x, OMMP(M) has to run at least s steps to recover x
for any $M \in [1, s] \cap \mathbb{Z}$. However, as shown in [8], when the s-sparse
signal x is fast-decaying, OMP has a good performance. To state the result in [8],
we first introduce the definition of $\alpha$-decaying signals. For any s-sparse
signal $x \in \mathbb{C}^N$, we denote by S the support of x. Without loss of
generality, we suppose that $S = \{j_1, \ldots, j_s\}$ and

$$|x_{j_1}| \ge |x_{j_2}| \ge \cdots \ge |x_{j_s}| > 0.$$

For $\alpha > 1$, we call x $\alpha$-decaying if $|x_{j_i}| / |x_{j_{i+1}}| \ge \alpha$
for all $i \in \{1, 2, \ldots, s-1\}$.
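The definition of an $\alpha$-decaying signal can be checked directly by sorting the nonzero magnitudes and taking the smallest ratio of consecutive entries. The helper names below are hypothetical and only illustrate the definition; a signal with fewer than two nonzero entries is treated as satisfying the condition vacuously.

```python
import numpy as np


def decay_ratio(x):
    """Smallest ratio |x_{j_i}| / |x_{j_{i+1}}| over the sorted nonzero entries."""
    mags = np.sort(np.abs(x[np.nonzero(x)]))[::-1]   # |x_{j_1}| >= ... >= |x_{j_s}| > 0
    if mags.size < 2:
        return np.inf                                # no consecutive pair to compare
    return float(np.min(mags[:-1] / mags[1:]))


def is_alpha_decaying(x, alpha):
    """True if every consecutive magnitude ratio is at least alpha."""
    return decay_ratio(x) >= alpha
```

For example, the signal with nonzero entries 8, -4, 2 has consecutive ratios 2 and 2, so it is $\alpha$-decaying for every $\alpha \le 2$ and not for larger $\alpha$.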

Theorem 4. ([8]) Suppose that A satisfies $(s+1, \delta_{s+1})$-RIP with
$\delta_{s+1} < 1/3$. Suppose that x with $\|x\|_0 \le s$ is an $\alpha$-decaying
signal. If

(2) a > 



1 



then OMP will recover x exactly from y = Ax in s iterations.



In this paper, motivated by the proof of Theorem 1, we can improve Theorem 4
as follows:

Theorem 5. Suppose that A satisfies $(s, \delta_s)$-RIP with
$\delta_s < \sqrt{2} - 1$. Suppose that $x \in \mathbb{C}^N$ with $\|x\|_0 \le s$ is
$\alpha$-decaying. If

(3) $$\alpha > \sqrt{\frac{1-\delta_s^2}{2-(1+\delta_s)^2}},$$

then OMP can recover x exactly from y = Ax in s iterations.

Remark 4. In Theorem 4, the right side of (2) depends on the RIP constant and
$s = \|x\|_0$, while in Theorem 5, the right side of (3) only depends on the RIP
constant. So, Theorem 5 is an improvement over Theorem 4.

Appendix A. Lemmas 

In this section, we introduce several lemmas, which extend some results in [9]. To
state conveniently, for any set $T \subset \{1, \ldots, N\}$ of column indices, we
denote by $A_T$ the $m \times \#T$ matrix composed of these columns. Similarly, for
a vector $x \in \mathbb{C}^N$, we use $x_T$ to denote the vector formed by the
entries of x with indices from T. Set $n_0 := \lfloor N/t \rfloor$. For
$u \in \mathbb{C}^N$ and $t \in \mathbb{Z}_+$, we extend the $\ell_1$-norm to a
generalized $\ell_1$-norm defined as



"""^ r~ / 

l|u||t,i + ■■■ + ^u+i)t + yK,t+i + ■■■ + <■ 

3=0 

Similarly, we also can extend the £oo-norm as follows 



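Lemma 1. Suppose that the sampling matrix $A \in \mathbb{C}^{m \times N}$. Suppose
that $\Lambda^n \subset \Lambda^{n+1} \subset \{1, \ldots, N\}$ and set
$T^n := \Lambda^{n+1} \setminus \Lambda^n$ with $t := \#T^n$. Let

$$x^n := \mathrm{argmin}_{\mathrm{supp}(z) \subset \Lambda^n} \|y - Az\|_2,$$

(4) $$x^{n+1} := \mathrm{argmin}_{\mathrm{supp}(z) \subset \Lambda^{n+1}} \|y - Az\|_2,$$

and

$$V^n := A^*_{T^n}(y - Ax^n).$$

Then

$$\|y - Ax^{n+1}\|_2^2 \le \|y - Ax^n\|_2^2 - \frac{\|V^n\|_2^2}{1+\delta_t}.$$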

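Proof. The definition of $x^{n+1}$ implies that the residual $y - Ax^{n+1}$ is
orthogonal to the space $\mathrm{span}(A_{\Lambda^{n+1}})$. Noting
$A(x^{n+1} - x^n) \in \mathrm{span}(A_{\Lambda^{n+1}})$, we obtain that

$$\langle y - Ax^{n+1}, A(x^{n+1} - x^n)\rangle = 0,$$

which implies that

$$\|y - Ax^n\|_2^2 = \|y - Ax^{n+1} + A(x^{n+1} - x^n)\|_2^2$$
(5) $$= \|y - Ax^{n+1}\|_2^2 + \|A(x^{n+1} - x^n)\|_2^2.$$

Also, noting that $(A^*y)_{\Lambda^{n+1}} = (A^*Ax^{n+1})_{\Lambda^{n+1}}$,
$(A^*y)_{\Lambda^n} = (A^*Ax^n)_{\Lambda^n}$ and $(x^n)_{T^n} = 0$, we have

$$(A^*A(x^{n+1} - x^n))_{\Lambda^{n+1}} = (A^*(y - Ax^n))_{\Lambda^{n+1}},$$

since

$$(A^*A(x^{n+1} - x^n))_{T^n} = (A^*y)_{T^n} - (A^*Ax^n)_{T^n} = (A^*(y - Ax^n))_{T^n}$$

and

$$(A^*A(x^{n+1} - x^n))_{\Lambda^n} = (A^*y)_{\Lambda^n} - (A^*Ax^n)_{\Lambda^n} = (A^*(y - Ax^n))_{\Lambda^n}.$$

To this end, we consider

(6) $$\|A(x^{n+1} - x^n)\|_2^2 = \langle x^{n+1} - x^n, A^*A(x^{n+1} - x^n)\rangle$$
$$= \langle x^{n+1} - x^n, (A^*A(x^{n+1} - x^n))_{\Lambda^{n+1}}\rangle$$
$$= \langle (x^{n+1} - x^n)_{T^n}, (A^*(y - Ax^n))_{T^n}\rangle$$
$$= \langle (x^{n+1})_{T^n}, (A^*(y - Ax^n))_{T^n}\rangle$$
$$= \langle (A^*(y - Ax^n))_{T^n}, (x^{n+1})_{T^n}\rangle$$
$$= \langle V^n, (x^{n+1})_{T^n}\rangle.$$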



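According to (4),

$$x^{n+1}_{\Lambda^{n+1}} = A^{+}_{\Lambda^{n+1}} y,$$

where $A^{+}_{\Lambda^{n+1}} = (A^*_{\Lambda^{n+1}}A_{\Lambda^{n+1}})^{-1}A^*_{\Lambda^{n+1}}$
is the Moore-Penrose pseudoinverse of $A_{\Lambda^{n+1}}$. And hence

$$x^{n+1}_{\Lambda^{n+1}} = (A^*_{\Lambda^{n+1}}A_{\Lambda^{n+1}})^{-1}A^*_{\Lambda^{n+1}} y.$$

We can write $A_{\Lambda^{n+1}}$ as $A_{\Lambda^{n+1}} = [A_{\Lambda^n}, A_{T^n}]$. Then

$$A^*_{\Lambda^{n+1}}A_{\Lambda^{n+1}} = \begin{pmatrix} A^*_{\Lambda^n}A_{\Lambda^n} & A^*_{\Lambda^n}A_{T^n} \\ A^*_{T^n}A_{\Lambda^n} & A^*_{T^n}A_{T^n} \end{pmatrix}.$$

We next consider

$$(A^*_{\Lambda^{n+1}}A_{\Lambda^{n+1}})^{-1} = \begin{pmatrix} M_1 & M_2 \\ M_3 & M_4 \end{pmatrix},$$

where

$$M_4 = \big(A^*_{T^n}A_{T^n} - A^*_{T^n}A_{\Lambda^n}A^{+}_{\Lambda^n}A_{T^n}\big)^{-1}, \qquad M_3 = -M_4 (A^*_{T^n}A_{\Lambda^n})(A^*_{\Lambda^n}A_{\Lambda^n})^{-1}.$$

Noting that $A_{\Lambda^n}A^{+}_{\Lambda^n}y = Ax^n$, we obtain that

$$(x^{n+1})_{T^n} = \big((A^*_{\Lambda^{n+1}}A_{\Lambda^{n+1}})^{-1}A^*_{\Lambda^{n+1}}y\big)_{T^n}$$
$$= -M_4(A^*_{T^n}A_{\Lambda^n})(A^*_{\Lambda^n}A_{\Lambda^n})^{-1}A^*_{\Lambda^n}y + M_4A^*_{T^n}y$$
$$= M_4A^*_{T^n}\big(-A_{\Lambda^n}(A^*_{\Lambda^n}A_{\Lambda^n})^{-1}A^*_{\Lambda^n}y + y\big)$$
$$= M_4A^*_{T^n}\big(-A_{\Lambda^n}A^{+}_{\Lambda^n}y + y\big)$$
$$= M_4A^*_{T^n}(y - Ax^n)$$
$$= M_4V^n.$$

Then (6) implies that

(7) $$(V^n)^*M_4V^n = \langle V^n, (x^{n+1})_{T^n}\rangle = \|A(x^{n+1} - x^n)\|_2^2.$$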
To this end, we consider $u^*M_4^{-1}u$ for any $u \in \mathbb{C}^t$. Note that

$$u^*M_4^{-1}u = u^*A^*_{T^n}A_{T^n}u - \langle A_{T^n}u, P_{A_{\Lambda^n}}(A_{T^n}u)\rangle$$
$$= \|A_{T^n}u\|_2^2 - \|P_{A_{\Lambda^n}}(A_{T^n}u)\|_2^2$$
(8) $$\le (1+\delta_t)\|u\|_2^2,$$

where $P_{A_{\Lambda^n}}(A_{T^n}u)$ denotes the orthogonal projection of $A_{T^n}u$
onto the subspace $\mathrm{span}(A_{\Lambda^n})$. In the last inequality, we use the
RIP property of A. Noting that
$\|P_{A_{\Lambda^n}}(A_{T^n}u)\|_2 \le \|A_{T^n}u\|_2$, (8) implies that $M_4$ is a
positive-definite matrix. Combining (7) and (8), we obtain that

$$\|A(x^{n+1} - x^n)\|_2^2 = (V^n)^*M_4V^n \ge \frac{1}{1+\delta_t}\|V^n\|_2^2.$$

Then (5) implies that

$$\|y - Ax^{n+1}\|_2^2 = \|y - Ax^n\|_2^2 - \|A(x^{n+1} - x^n)\|_2^2 \le \|y - Ax^n\|_2^2 - \frac{1}{1+\delta_t}\|V^n\|_2^2.$$

□ 
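Lemma 1 can be sanity-checked numerically on a random instance. The sketch below is an illustration under stated assumptions, not part of the proof: it uses real-valued data, a hypothetical helper `lemma1_check`, and replaces $1+\delta_t$ by the largest eigenvalue of $A^*_{T^n}A_{T^n}$, which is at most $1+\delta_t$ by (1). The inequality checked is therefore slightly stronger than the one stated in the lemma, and it follows from the same argument since $M_4^{-1} \preceq A^*_{T^n}A_{T^n}$.

```python
import numpy as np


def ls_fit(A, y, support):
    """Least-squares solution of min ||y - A z||_2 with supp(z) inside `support`."""
    x = np.zeros(A.shape[1])
    coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
    x[support] = coef
    return x


def lemma1_check(A, y, Lam_n, T_n):
    """Return both sides of identity (5) and of the residual-decrease bound."""
    x_n = ls_fit(A, y, Lam_n)
    x_next = ls_fit(A, y, Lam_n + T_n)               # Lambda^{n+1} = Lambda^n U T^n
    r_n, r_next = y - A @ x_n, y - A @ x_next
    V = A[:, T_n].T @ r_n                            # V^n = A_T^* (y - A x^n)
    step = np.linalg.norm(A @ (x_next - x_n)) ** 2
    lam_max = np.linalg.eigvalsh(A[:, T_n].T @ A[:, T_n])[-1]   # <= 1 + delta_t
    pythagoras = (np.linalg.norm(r_n) ** 2, np.linalg.norm(r_next) ** 2 + step)
    decrease = (np.linalg.norm(r_next) ** 2,
                np.linalg.norm(r_n) ** 2 - np.linalg.norm(V) ** 2 / lam_max)
    return pythagoras, decrease
```

The first returned pair should agree up to rounding, which is identity (5); in the second pair the first entry should not exceed the second, which is the residual decrease of Lemma 1 with $\lambda_{\max}(A^*_{T^n}A_{T^n})$ in place of $1+\delta_t$.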



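Lemma 2. Consider OMMP(M) and $\Lambda^n \subset \Lambda^{n+1} \subset \{1, \ldots, N\}$.
Set $T^n := \Lambda^{n+1} \setminus \Lambda^n$ and $t := \#T^n$. Suppose that the
sampling matrix $A \in \mathbb{C}^{m \times N}$ has $\ell_2$-normalized columns
$a_1, \ldots, a_N$. Then for any $u \in \mathbb{C}^N$ whose support U is not included
in $\Lambda^n$, we have

$$\|V^n\|_2^2 \ge \frac{\|A(u - x^n)\|_2^2\,\big(\|y - Ax^n\|_2^2 - \|y - Au\|_2^2\big)}{\|u_{\overline{\Lambda^n}}\|_{t,1}^2},$$

where $V^n = A^*_{T^n}(y - Ax^n)$.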

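Proof. To this end, we only need to prove that

$$\|V^n\|_2^2 \cdot \|u_{\overline{\Lambda^n}}\|_{t,1}^2 \ge \|A(u - x^n)\|_2^2 \cdot \big(\|y - Ax^n\|_2^2 - \|y - Au\|_2^2\big).$$

When

$$\|y - Ax^n\|_2^2 - \|y - Au\|_2^2 \le 0,$$

the conclusion holds. So, we only consider the case where

$$\|y - Ax^n\|_2^2 - \|y - Au\|_2^2 > 0.$$

Recall that $T^n$ is the set of t indices corresponding to the largest magnitude
entries in the vector $(A^*(y - Ax^n))_{\overline{\Lambda^n}}$. Then

$$\|V^n\|_2 \ge \|(A^*(y - Ax^n))_{\overline{\Lambda^n}}\|_{t,\infty}.$$

We consider

$$\|V^n\|_2 \cdot \|u_{\overline{\Lambda^n}}\|_{t,1} \ge \|(A^*(y - Ax^n))_{\overline{\Lambda^n}}\|_{t,\infty} \cdot \|(u - x^n)_{\overline{\Lambda^n}}\|_{t,1}$$
$$\ge \mathrm{Re}\,\langle (u - x^n)_{\overline{\Lambda^n}}, (A^*(y - Ax^n))_{\overline{\Lambda^n}}\rangle$$
$$= \mathrm{Re}\,\langle u - x^n, A^*(y - Ax^n)\rangle$$
$$= \mathrm{Re}\,\langle A(u - x^n), y - Ax^n\rangle$$
$$= \tfrac{1}{2}\big(\|A(u - x^n)\|_2^2 + \|y - Ax^n\|_2^2 - \|A(u - x^n) - (y - Ax^n)\|_2^2\big)$$
$$= \tfrac{1}{2}\big(\|A(u - x^n)\|_2^2 + \|y - Ax^n\|_2^2 - \|y - Au\|_2^2\big)$$
$$\ge \|A(u - x^n)\|_2 \cdot \sqrt{\|y - Ax^n\|_2^2 - \|y - Au\|_2^2},$$

which implies the result. □

Lemma 3. Under the conditions of Lemma 2, we have

(9) $$\|y - Ax^{n+1}\|_2^2 \le \|y - Ax^n\|_2^2 - \frac{1-\delta}{(1+\delta_t)\left\lceil \frac{\#(U \setminus \Lambda^n)}{t} \right\rceil}\max\{0, \|y - Ax^n\|_2^2 - \|y - Au\|_2^2\},$$

where $\delta := \delta_{\#(U \cup \Lambda^n)}$.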

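Proof. According to Lemma 1 and Lemma 2, we have

$$\|y - Ax^{n+1}\|_2^2 \le \|y - Ax^n\|_2^2 - \frac{1}{1+\delta_t}\|V^n\|_2^2$$
$$\le \|y - Ax^n\|_2^2 - \frac{\|A(u - x^n)\|_2^2\,\big(\|y - Ax^n\|_2^2 - \|y - Au\|_2^2\big)}{(1+\delta_t)\|u_{\overline{\Lambda^n}}\|_{t,1}^2}.$$

The Cauchy-Schwarz inequality implies that

$$\|u_{\overline{\Lambda^n}}\|_{t,1}^2 \le \left\lceil \frac{\#(U \setminus \Lambda^n)}{t} \right\rceil \|u_{\overline{\Lambda^n}}\|_2^2.$$

Also,

$$\|A(u - x^n)\|_2^2 \ge (1-\delta)\|u - x^n\|_2^2$$
$$\ge (1-\delta)\|(u - x^n)_{\overline{\Lambda^n}}\|_2^2$$
$$= (1-\delta)\|u_{\overline{\Lambda^n}}\|_2^2.$$

Combining the results above, we arrive at the conclusion.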



□ 



Remark 5. Lemma 3 extends some results in [9], where Foucart considered the
case with $t = \#(\Lambda^{n+1} \setminus \Lambda^n) = 1$, to the general case. In
fact, if one takes t = 1 in Lemma 3, one can obtain Lemma 4 in [9].



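Appendix B. Proof of Theorem 1

Proof of Theorem 1. To state conveniently, we set $x' := x_{\overline{\Lambda^0}}$ and
$K := \max\{s', \frac{8}{M}s'\}$. To this end, we only need to prove that
$S \subset \Lambda^K$, i.e., $\#(S \setminus \Lambda^K) = 0$. The proof is by
induction on s'. If s' = 0, then the conclusion holds. For the induction step, we
assume that the result holds up to an integer s' - 1. We next show that it holds
for s'.

Without loss of generality, we suppose that

$$|x'_1| \ge |x'_2| \ge \cdots \ge |x'_{s'}| > 0.$$

For $\ell = 1, \ldots, \max\{0, \lceil \log_2 \frac{s'}{M} \rceil\} + 1$, we set

$$\tilde{x}^\ell_j := \begin{cases} x'_j & \text{if } j \ge 2^{\ell-1} \cdot M + 1, \\ 0 & \text{else}, \end{cases}$$

and $\tilde{x}^0 := x'$. Suppose that $L \in \mathbb{Z}$ is such that

(10) $$\|\tilde{x}^0\|_2^2 < \mu\|\tilde{x}^1\|_2^2, \ \ldots, \ \|\tilde{x}^{L-2}\|_2^2 < \mu\|\tilde{x}^{L-1}\|_2^2$$

and

(11) $$\|\tilde{x}^{L-1}\|_2^2 \ge \mu\|\tilde{x}^L\|_2^2.$$

And hence, L is the least integer such that
$\|\tilde{x}^{L-1}\|_2^2 \ge \mu\|\tilde{x}^L\|_2^2$, and we will choose
$\mu \ge 2$ later. The existence of such an L follows from
$\|\tilde{x}^\ell\|_2^2 = 0$ when
$\ell = \max\{0, \lceil \log_2 \frac{s'}{M} \rceil\} + 1$. And hence, we have

$$1 \le L \le \max\left\{0, \left\lceil \log_2 \frac{s'}{M} \right\rceil\right\} + 1.$$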

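We first consider the case where L = 1. We take $u = u^1 := x - \tilde{x}^1$ and
t = M in (9). Then a simple observation is that

$$\#\mathrm{supp}(u^1) = \#\Lambda^0 + \min\{M, s'\}.$$

Then,

$$\#(\mathrm{supp}(u^1) \setminus \Lambda^0) = \min\{M, s'\}.$$

Noting that $\left\lceil \frac{\#(\mathrm{supp}(u^1) \setminus \Lambda^0)}{M} \right\rceil = 1$ and

$$\|y - Au^1\|_2^2 = \|Ax - Au^1\|_2^2 = \|A\tilde{x}^1\|_2^2,$$

by (9), we can obtain that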

max{0, !|y - A^'g - \\AS,X} < (l - niax{0, \\y - A^Y2 - USc^j}, 

which implies that 

l|y-^xi||^ < (l-l-^]\\y-AA\l + \\AS.Y2 



1 + 5, 
1 - (5, 



< (l+Ss) I- 



2 

1 - Ss 
1 + Ss 



C,0||2 I II ~ 111 2 



< 26M'\\l + ^U'\\l 



2S. + l±^ 



On the other hand, we note that 



ur,0ii2 



ly-Axi^ > p(x-xi)r 



2 

1 1|2 



> (1-<52.)||X-X^||^ 

> il-S,s)\\^^\\l 
Then, combining the two inequalities, we obtain that



J2s 

Noting $\delta_s \le \delta_{2s} \le \delta_{9s} \le \frac{1}{10}$, we have

l + S. 



1-3(52 



<2</i, 



which implies that
And hence, 



1 - \ M 



I.e. 



$$\#(S \setminus \Lambda^1) \le s' - 1.$$

Using the induction assumption, we can recover the s-sparse signal x in 1 + n
iterations, where $n = \max\{s' - 1, \frac{8}{M}(s' - 1)\}$. Then, the conclusion
follows since

$$1 + n = 1 + \max\{s' - 1, \tfrac{8}{M}(s' - 1)\} \le \max\{s', \tfrac{8}{M}s'\}.$$

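We next consider the case where $L \ge 2$. We take $u = u^\ell := x - \tilde{x}^\ell$
and t = M in (9). Then a simple observation is that

$$\#\mathrm{supp}(u^\ell) = \#\Lambda^0 + \min\{2^{\ell-1}M, s'\}.$$

Then, for any $n \ge 0$,

$$\#(\mathrm{supp}(u^\ell) \setminus \Lambda^n) = \#\Lambda^0 + \min\{2^{\ell-1}M, s'\} - \#(\mathrm{supp}(u^\ell) \cap \Lambda^n) \le \min\{2^{\ell-1}M, s'\}.$$

To state conveniently, we set

$$U_\ell := \left\lceil \frac{\min\{2^{\ell-1}M, s'\}}{M} \right\rceil.$$

Noting that

$$\|y - Au^\ell\|_2^2 = \|Ax - Au^\ell\|_2^2 = \|A\tilde{x}^\ell\|_2^2,$$

we obtain that

$$\max\{0, \|y - Ax^{n+1}\|_2^2 - \|A\tilde{x}^\ell\|_2^2\} \le \left(1 - \frac{1-\delta_{s+nM}}{(1+\delta_M)U_\ell}\right)\max\{0, \|y - Ax^n\|_2^2 - \|A\tilde{x}^\ell\|_2^2\}$$
(12) $$\le \exp\left(-\frac{1-\delta_{s+nM}}{(1+\delta_M)U_\ell}\right)\max\{0, \|y - Ax^n\|_2^2 - \|A\tilde{x}^\ell\|_2^2\}.$$

Iterating (12) k times leads to

$$\max\{0, \|y - Ax^{n+k}\|_2^2 - \|A\tilde{x}^\ell\|_2^2\} \le \exp\left(-k\frac{1-\delta_{s+KM}}{(1+\delta_M)U_\ell}\right)\max\{0, \|y - Ax^n\|_2^2 - \|A\tilde{x}^\ell\|_2^2\},$$

which implies that

(13) $$\|y - Ax^{n+k}\|_2^2 \le \exp\left(-k\frac{1-\delta_{s+KM}}{(1+\delta_M)U_\ell}\right)\|y - Ax^n\|_2^2 + \|A\tilde{x}^\ell\|_2^2,$$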
where k and K are integers satisfying $K \ge n + k$. To state conveniently, for
$\ell = 1, \ldots, L$, we set $k_\ell := k \cdot U_\ell$, $k_0 := 0$,
$K := k_1 + \cdots + k_L$ and

$$\nu := \exp\left(-k\frac{1-\delta_{s+KM}}{1+\delta_M}\right),$$

and we will choose k later. For $\ell = 1, \ldots, L$, we take
$n := k_0 + \cdots + k_{\ell-1}$ and $k := k_\ell$ in (13) and arrive at

$$\|y - Ax^{k_1+\cdots+k_\ell}\|_2^2 \le \nu\|y - Ax^{k_1+\cdots+k_{\ell-1}}\|_2^2 + \|A\tilde{x}^\ell\|_2^2.$$

Then, using the inequality L times, we can obtain that

$$\|y - Ax^K\|_2^2 \le \nu^L\|y - Ax^0\|_2^2 + \nu^{L-1}\|A\tilde{x}^1\|_2^2 + \cdots + \nu\|A\tilde{x}^{L-1}\|_2^2 + \|A\tilde{x}^L\|_2^2$$
$$\le \nu^L\|A\tilde{x}^0\|_2^2 + \cdots + \nu\|A\tilde{x}^{L-1}\|_2^2 + \|A\tilde{x}^L\|_2^2.$$

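Here, for the second relation, we use the fact that

$$\|y - Ax^0\|_2^2 = \min_{\mathrm{supp}(z) \subset \Lambda^0}\|y - Az\|_2^2 \le \|y - A(x - \tilde{x}^0)\|_2^2 = \|A\tilde{x}^0\|_2^2$$

with $\mathrm{supp}(x - \tilde{x}^0) \subset \Lambda^0$. Combining the RIP property of
A, (10) and (11), we obtain that

$$\|A\tilde{x}^\ell\|_2^2 \le (1+\delta_s)\|\tilde{x}^\ell\|_2^2 \le (1+\delta_s)\mu^{L-1-\ell}\|\tilde{x}^{L-1}\|_2^2$$

for $\ell = 0, 1, \ldots, L$. Note that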



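$$\|y - Ax^K\|_2^2 \le \sum_{\ell=0}^{L}\nu^{L-\ell}\|A\tilde{x}^\ell\|_2^2 \le \sum_{\ell=0}^{L}\nu^{L-\ell}(1+\delta_s)\mu^{L-1-\ell}\|\tilde{x}^{L-1}\|_2^2$$
(14) $$\le \frac{(1+\delta_s)\|\tilde{x}^{L-1}\|_2^2}{\mu(1-\mu\nu)}$$

and

$$\|y - Ax^K\|_2^2 = \|A(x - x^K)\|_2^2$$
(15) $$\ge (1-\delta_{s+KM})\|x - x^K\|_2^2$$
$$\ge (1-\delta_{s+KM})\|x_{\overline{\Lambda^K}}\|_2^2.$$

Combining (14) and (15), we obtain that

$$\|x_{\overline{\Lambda^K}}\|_2^2 \le \frac{(1+\delta_s)}{(1-\delta_{s+KM})\mu(1-\mu\nu)}\|\tilde{x}^{L-1}\|_2^2.$$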

We can choose k = 2 and a suitable $\mu \ge 2$, with
$\delta_{s+KM} \le \delta_{9s} \le \frac{1}{10}$ and

$$K = k_1 + \cdots + k_L < 2^L k \le 8\frac{s'}{M}.$$

Noting that $\nu \le \exp(-18/11)$, we have

$$\frac{(1+\delta_s)}{(1-\delta_{s+KM})\mu(1-\mu\nu)} < 1,$$

which implies that

$$\|x_{\overline{\Lambda^K}}\|_2^2 < \|\tilde{x}^{L-1}\|_2^2.$$

As a result, after K iterations, we have

$$\#(S \setminus \Lambda^K) \le \#((S \setminus \Lambda^0) \setminus U^{L-1}) - 1 = s' - 2^{L-2} \cdot M - 1,$$

with

$$K = k_1 + \cdots + k_L < 2^L k.$$

Now we continue the algorithm from iteration K. According to the induction
assumption, we can recover the s-sparse signal x in K + n iterations, where

$$n = \max\{s' - 2^{L-2} \cdot M - 1, \tfrac{8}{M}(s' - 2^{L-2} \cdot M - 1)\}.$$

Noting $K + n \le \max\{s', \frac{8}{M}s'\}$, we arrive at the conclusion.

□ 

Appendix C. Proofs of Theorem 2 and Theorem 3

To prove Theorem 2 and Theorem 3, we first introduce two lemmas.

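Lemma 4. Consider the OMMP(M) algorithm with $1 \le M \le s$. Suppose that the
sampling matrix $A \in \mathbb{C}^{m \times N}$ satisfies $(9s, 1/10)$-RIP. Suppose
that x is s-sparse and

$$C_0 := \frac{\max_{j \in S}|x_j|}{\min_{j \in S}|x_j|}.$$

Set $s' := \#(S \setminus \Lambda^0)$ and $K := 8\frac{s'}{M} + 8(C_0^2 + 1)M$. Then
$\#(S \setminus \Lambda^K) = 0$.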

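Proof. To state conveniently, we set $x' := x_{\overline{\Lambda^0}}$ and
$C_2 := \frac{C_0^2}{\mu - 1} + 1$. We will choose $\mu \ge 2$ later so that
$C_2 \le C_0^2 + 1$. To this end, we will prove that
$\#(S \setminus \Lambda^{K_1}) = 0$ with $K_1 = 8\frac{s'}{M} + 8C_2M$, which implies
the result. The proof is by induction on $s' = \#(S \setminus \Lambda^0)$. We first
prove the lemma for the case where $s' \le C_2M$. According to Theorem 1, OMMP(M)
recovers the s-sparse signal within $8C_2M \le 8(C_0^2 + 1)M$ iterations, which
implies the result.

We next consider the case where $s' > C_2M$. Without loss of generality, we
suppose that

$$|x'_1| \ge |x'_2| \ge \cdots \ge |x'_{s'}| > 0.$$

For convenience, for $\ell = 1, \ldots, \lceil \log_2(\frac{s'}{M}) \rceil + 1$, we set

$$\tilde{x}^\ell_j := \begin{cases} x'_j & \text{if } 2^{\ell-1}M + 1 \le j, \\ 0 & \text{else}, \end{cases}$$

and $\tilde{x}^0 := x'$. Suppose that $L \in \mathbb{Z}$ is such that

(16) $$\|\tilde{x}^0\|_2^2 < \mu\|\tilde{x}^1\|_2^2, \ \ldots, \ \|\tilde{x}^{L-2}\|_2^2 < \mu\|\tilde{x}^{L-1}\|_2^2$$

and

(17) $$\|\tilde{x}^{L-1}\|_2^2 \ge \mu\|\tilde{x}^L\|_2^2.$$

And hence, L is the least integer such that
$\|\tilde{x}^{L-1}\|_2^2 \ge \mu\|\tilde{x}^L\|_2^2$. The existence of such an L
follows from $\|\tilde{x}^\ell\|_2^2 = 0$ when
$\ell = \lceil \log_2 \frac{s'}{M} \rceil + 1$. The assumption $s' > C_2M$ implies
that $\|\tilde{x}^0\|_2^2 < \mu\|\tilde{x}^1\|_2^2$. And hence, we have
$2 \le L \le \lceil \log_2 \frac{s'}{M} \rceil + 1$. We take

$$u = u^\ell := x - \tilde{x}^\ell$$

and t = M in (9). Then a simple observation is that

$$\#\mathrm{supp}(u^\ell) = \#\Lambda^0 + \min\{2^{\ell-1}M, s'\}.$$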
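For any $n \ge 0$,

$$\#(\mathrm{supp}(u^\ell) \setminus \Lambda^n) = \#\Lambda^0 + \min\{2^{\ell-1}M, s'\} - \#(\mathrm{supp}(u^\ell) \cap \Lambda^n) \le \min\{2^{\ell-1}M, s'\}.$$

To state conveniently, we set

$$U_\ell := \left\lceil \frac{\min\{2^{\ell-1}M, s'\}}{M} \right\rceil.$$

Noting that

$$\|y - Au^\ell\|_2^2 = \|Ax - Au^\ell\|_2^2 = \|A\tilde{x}^\ell\|_2^2,$$

by (9), we obtain that

$$\max\{0, \|y - Ax^{n+1}\|_2^2 - \|A\tilde{x}^\ell\|_2^2\} \le \left(1 - \frac{1-\delta_{s+nM}}{(1+\delta_M)U_\ell}\right)\max\{0, \|y - Ax^n\|_2^2 - \|A\tilde{x}^\ell\|_2^2\}$$
(18) $$\le \exp\left(-\frac{1-\delta_{s+nM}}{(1+\delta_M)U_\ell}\right)\max\{0, \|y - Ax^n\|_2^2 - \|A\tilde{x}^\ell\|_2^2\}.$$

Iterating (18) for k times leads to

$$\max\{0, \|y - Ax^{n+k}\|_2^2 - \|A\tilde{x}^\ell\|_2^2\} \le \exp\left(-k\frac{1-\delta_{s+KM}}{(1+\delta_M)U_\ell}\right)\max\{0, \|y - Ax^n\|_2^2 - \|A\tilde{x}^\ell\|_2^2\},$$

which implies that

(19) $$\|y - Ax^{n+k}\|_2^2 \le \exp\left(-k\frac{1-\delta_{s+KM}}{(1+\delta_M)U_\ell}\right)\|y - Ax^n\|_2^2 + \|A\tilde{x}^\ell\|_2^2,$$

where k and K are integers satisfying $K \ge n + k$.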

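To state conveniently, for $\ell = 1, \ldots, L$, we set $k_\ell := k \cdot U_\ell$,
$K := k_1 + \cdots + k_L$ and

$$\nu := \exp\left(-k\frac{1-\delta_{s+KM}}{1+\delta_M}\right),$$

and we will choose k later. We use (19) and a similar argument to the one in the
proof of Theorem 1 to obtain that

$$\|y - Ax^K\|_2^2 \le \sum_{\ell=0}^{L}\nu^{L-\ell}\|A\tilde{x}^\ell\|_2^2$$
(20) $$\le \frac{(1+\delta_s)\|\tilde{x}^{L-1}\|_2^2}{\mu(1-\mu\nu)}.$$

Note that

(21) $$\|y - Ax^K\|_2^2 = \|A(x - x^K)\|_2^2 \ge (1-\delta_{s+KM})\|x - x^K\|_2^2 \ge (1-\delta_{s+KM})\|x_{\overline{\Lambda^K}}\|_2^2.$$

Combining (20) and (21), we arrive at

$$\|x_{\overline{\Lambda^K}}\|_2^2 \le \frac{(1+\delta_s)}{(1-\delta_{s+KM})\mu(1-\mu\nu)}\|\tilde{x}^{L-1}\|_2^2.$$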
We can choose k = 2 and a suitable $\mu \ge 2$, with

$$\delta_{s+KM} \le \delta_{9s} \le \frac{1}{10}.$$

And hence $\nu \le \exp(-18/11)$.



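Here, we use $s + KM \le s + 4ks' \le 9s$ with

$$K = k_1 + \cdots + k_L \le k(1 + \cdots + 2^{L-1}) < 2^L k \le 4 \cdot k \cdot \frac{s'}{M}.$$

Then

$$\frac{(1+\delta_s)}{(1-\delta_{s+KM})\mu(1-\mu\nu)} < 1,$$

which implies that

$$\|x_{\overline{\Lambda^K}}\|_2^2 < \|\tilde{x}^{L-1}\|_2^2.$$

As a result, after K iterations, we have

$$\#(S \setminus \Lambda^K) \le \#((S \setminus \Lambda^0) \setminus U^{L-1}) = s' - 2^{L-2}M,$$

with

$$K = k_1 + \cdots + k_L \le k(1 + \cdots + 2^{L-1}) < 2^L k.$$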

Now we continue the algorithm from the iteration K. According to the induction
assumption, we can recover the s-sparse signal x in K + n iterations, where

$$n \le 8\frac{s' - 2^{L-2}M}{M} + 8C_2M.$$

Note that $L \ge 2$ and

$$2^L k + 8\frac{s' - 2^{L-2}M}{M} = 8 \cdot 2^{L-2} + 8\frac{s'}{M} - 8 \cdot 2^{L-2} = 8\frac{s'}{M}.$$

Then we arrive at

$$K + n \le 8\frac{s'}{M} + 8C_2M,$$

which implies the result. □

Lemma 5. Suppose that x is s-sparse and

$$C_0 := \frac{\max_{j \in S}|x_j|}{\min_{j \in S}|x_j|}.$$

Consider the OMMP(M) algorithm with 1 < M < s. Suppose that the sam- 

pling matrix A S C'^x-'V i^jfiQgp columns ai, . . . , are £2-normalized, and that A 
satisfies {Us, jq)-RIP. Set s' := A°) and 

s' s 
K ■=8— +A-ln2- — \ogJs' + 1). 

Then 4f^{S\K^) = 0. 
Proof. To state conveniently, we set $x' := x_{\overline{\Lambda^0}}$. The proof is by
induction on $s' = \#(S \setminus \Lambda^0)$. When s' = 0, the conclusion holds
trivially. Without loss of generality, we suppose that

$$|x'_1| \ge |x'_2| \ge \cdots \ge |x'_{s'}| > 0.$$

For convenience, for ^ = 1, . . . , [log2( jg-)] + 1, we set 



if 



^ lO else 

and x*^ := x'. Similar with the proof of LemmalU suppose that L is the least integer 
such that ||x'^^^||2 > mII^^III- We wiU choose fi > 2 lately. The assumption of M < 
(jij^2 ^ implies that ||x°||2 < ^||x^||2. And hence, we have 2 < L < \\0g2 JJ^ + 1. 
We take u = := x — x^ and t = M in ([9|). Then a simple observation is that 



#supp(u'') = #A" + min 



s 



For any n > 0, 



#(supp(uO \ A") = #A" + min| 2'-i^s' , s'j - #supp((uO n A") 



< min 
To state conveniently, we set 



2'-M,' 
s 



, s 



{l2^-^fs'\,s'} 
M 



kg := k ■ [/^, K := ki + ■ ■ ■ + k^ and v := cxp {^~k ^ i^g'^^"' j , and we will choose 

k lately. We use and a similar argument in the proof of Theorem [1] to obtain 
that 



ly-^x 



Kl\2 
2 



(22) 

Note that 



< 



< 



< 



^v^-^llAx 



(l + '5.)||x^-i||i 



(l + 5,)l|x^-'l" 



||y-Ax^|l^ > P(x-x^)|l^ 

> (l-<5.+A'M)l|x-x^i|l 
(23) > (l-J.+A'A/)l|xpr||2- 

Combining ([22]) and ([23]) . we arrive at 

I1„ l|2 < C^+Ss) ||~L-1||2 



(1 - Ss+Km)K^ - tJ'V) 
We can choose k = 2 and a suitable $\mu \ge 2$, with
$\delta_{s+KM} \le \delta_{14s} \le \frac{1}{10}$. And hence
$\nu \le \exp(-18/11)$. Here, we use $s + KM \le 13s$ with

K = ki^ h kL 

J _ _s' _ 

< k{l + --- + 2^-^)— + kL<2^k-+kL 

s s 

< 4-k^ + -kL<8^+A + 2log,^. 



Then 



which impHcs that 



(1 - (5s+/^j\/)/i(l - fJ-v) 



\^\k\\2 < 11^ II 2- 



As a result, after K iterations, we have 

#(^\A^) < #((^\A")\C/^-i)-l 



= s' — -1, 

s 



with 



A' = fci + --- + fcL < k{l + ■ ■ ■ + 2^-^)- + kL 

s 

- s' 

< 2^k- + kL. 

s 

Now we continue the algorithm from the iteration K. According to the induction 
assumption, we have 

#(5\A^'+") = 

with 

n = 8 — ^+4-ln2- — log, s' -2^-^ — s' 

M M s 



Note that L > 2 and that 

/ ; nT 9 Af I 1 

S 



s' ~2^'^^s' -I 



2^- +8 7T^ < - 

s M ~ M 



A simple calculation shows that 



fci + 4 • In 2 • — • log2 [s -2 



s - s ^ M 

= 4 . In 2 . — • log2 s' + fci + 4 • ln2 • — log2 f 1 - 2^-2_ 
<4.1n2~log2(s' + l). 



Then we arrive at 



i^ + «<8|^+4.1n2~. log2(s' + 1), 



which implies the result. □ 
Proof of Theorem 2. According to Lemma 4, after OMMP(M) runs K steps,
we have

$$S \subset \Lambda^K,$$

where

$$K = 8\frac{s}{M} + 8(C_0^2 + 1)M \le 8(C_0^2 + 2)\frac{s}{M}.$$

Since OMMP(M) chooses M atoms at each iteration, we have

$$\#\Lambda^K \le KM \le 8(C_0^2 + 2)s.$$

Noting that $\mathrm{Spark}(A) > 8(C_0^2 + 2)s$, we obtain that

$$\mathrm{argmin}_{z \in \mathbb{C}^N,\ \mathrm{supp}(z) \subset \Lambda^K}\|Az - y\|_2 = x,$$

which implies that OMMP(M) can recover the s-sparse signal x within
$8(C_0^2 + 2)\frac{s}{M}$ iterations. □

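Proof of Theorem 3. By Lemma 5, we have

$$S \subset \Lambda^K,$$

with

$$K = 8\frac{s}{M} + 4 \cdot \ln 2 \cdot \frac{s}{M}\log_2(s+1) \le 8\frac{s}{M}\log_2(2(s+1)) = \frac{8}{a}\log_2(2(s+1)).$$

Also, noting that

$$\#\Lambda^K \le KM \le 8s\log_2(2(s+1))$$

and

$$\mathrm{Spark}(A) > 8s\log_2(2(s+1)),$$

we obtain that

$$\mathrm{argmin}_{z \in \mathbb{C}^N,\ \mathrm{supp}(z) \subset \Lambda^K}\|Az - y\|_2 = x,$$

which implies the result. □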

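Appendix D. Proof of Theorem 5

Proof. The proof proceeds by induction. Hence, we assume that
$\Lambda^\ell \subset \mathrm{supp}(x)$ holds for $\ell = 0, \ldots, n-1 \le s-1$. We
next consider $\Lambda^n$. Set

$$\tilde{x}^{n-1} := x_{\overline{\Lambda^{n-1} \cup \{j^{n-1}\}}}, \qquad u := x_{\Lambda^{n-1} \cup \{j^{n-1}\}},$$

where $j^{n-1}$ is the index of the largest entry of $x_{\overline{\Lambda^{n-1}}}$ in
magnitude. Lemma 3 implies that

$$\|y - Ax^n\|_2^2 \le \|y - Ax^{n-1}\|_2^2 - \frac{1-\delta_s}{\#(U \setminus \Lambda^{n-1})}\max\{0, \|y - Ax^{n-1}\|_2^2 - \|y - Au\|_2^2\},$$

where $U := \mathrm{supp}(u)$. Noting that $\#(U \setminus \Lambda^{n-1}) = 1$, we have

$$\|y - Ax^n\|_2^2 \le \|y - Ax^{n-1}\|_2^2 - (1-\delta_s)\max\{0, \|y - Ax^{n-1}\|_2^2 - \|y - Au\|_2^2\}$$
$$= \|y - Ax^{n-1}\|_2^2 - (1-\delta_s)\max\{0, \|y - Ax^{n-1}\|_2^2 - \|A\tilde{x}^{n-1}\|_2^2\}.$$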

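We claim that

$$\|y - Ax^{n-1}\|_2^2 \ge \|A\tilde{x}^{n-1}\|_2^2.$$

Then we have

$$\|y - Ax^n\|_2^2 \le \delta_s\|y - Ax^{n-1}\|_2^2 + (1-\delta_s)\|A\tilde{x}^{n-1}\|_2^2$$
$$= \delta_s\|A(x - x^{n-1})\|_2^2 + (1-\delta_s)\|A\tilde{x}^{n-1}\|_2^2$$
$$\le \delta_s(1+\delta_s)\|x_{\overline{\Lambda^{n-1}}}\|_2^2 + (1-\delta_s)(1+\delta_s)\|\tilde{x}^{n-1}\|_2^2$$
$$\le (1+\delta_s)\left(\delta_s + \frac{1-\delta_s}{\alpha^2}\right)\|x_{\overline{\Lambda^{n-1}}}\|_2^2.$$

Here we use the fact that
$\|\tilde{x}^{n-1}\|_2^2 \le \|x_{\overline{\Lambda^{n-1}}}\|_2^2/\alpha^2$ with x being
$\alpha$-decaying. On the other hand, we have

$$\|y - Ax^n\|_2^2 = \|A(x - x^n)\|_2^2 \ge (1-\delta_s)\|x - x^n\|_2^2 \ge (1-\delta_s)\|x_{\overline{\Lambda^n}}\|_2^2.$$

Combining the results above, we obtain that

$$\|x_{\overline{\Lambda^n}}\|_2^2 \le \beta\|x_{\overline{\Lambda^{n-1}}}\|_2^2,$$

where

$$\beta := \frac{(1+\delta_s)\left(\delta_s + \frac{1-\delta_s}{\alpha^2}\right)}{1-\delta_s}.$$

When $\alpha > \sqrt{\frac{1-\delta_s^2}{2-(1+\delta_s)^2}}$, we have $\beta < 1$. And hence,

$$\|x_{\overline{\Lambda^n}}\|_2^2 < \|x_{\overline{\Lambda^{n-1}}}\|_2^2,$$

which implies that $\Lambda^n \subset \mathrm{supp}(x)$.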

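To this end, it remains to argue that

$$\|y - Ax^{n-1}\|_2^2 \ge \|A\tilde{x}^{n-1}\|_2^2.$$

We assume that

$$\|y - Ax^{n-1}\|_2^2 < \|A\tilde{x}^{n-1}\|_2^2,$$

and we shall derive a contradiction. The RIP property of the matrix A implies that

$$(1-\delta_s)\|x_{\overline{\Lambda^{n-1}}}\|_2^2 \le \|y - Ax^{n-1}\|_2^2 < \|A\tilde{x}^{n-1}\|_2^2 \le (1+\delta_s)\|\tilde{x}^{n-1}\|_2^2.$$

And hence,

$$\|x_{\overline{\Lambda^{n-1}}}\|_2^2 < \frac{1+\delta_s}{1-\delta_s}\|\tilde{x}^{n-1}\|_2^2.$$

Noting that $\alpha^2\|\tilde{x}^{n-1}\|_2^2 \le \|x_{\overline{\Lambda^{n-1}}}\|_2^2$, we have

$$\alpha^2 < \frac{1+\delta_s}{1-\delta_s},$$

which contradicts $\alpha^2 > \frac{1-\delta_s^2}{2-(1+\delta_s)^2}$.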